Abstract. We consider the problem of generating hard instances for the Satisfying
Assignment Search Problem (in short, SAT). It is not known whether SAT is difficult on
average, while it has been believed that the Factorization Problem (in short, FACT) is
hard on average. Thus, one can expect to generate hard-on-average instances by using
a reduction from FACT to SAT. Although the asymptotically best reduction is obtained
by using the Fast Fourier Transform [SS71] (in short, FFT), its constant factor is too big
in practice. Here we propose to use the Chinese Remainder Theorem for constructing
efficient yet simple reductions from FACT to SAT. First, by using the Chinese Remainder
Theorem recursively, we define a reduction that produces, from $n$ bit FACT instances,
SAT instances in the conjunctive normal form with $O(n^{1+\varepsilon})$ variables, where
$\varepsilon > 0$ is any fixed constant. (Cf. the reduction using FFT yields instances with
$O(n \log n \log\log n)$ variables.) Next we demonstrate the efficiency of our approach with
some concrete examples; we define a reduction that produces relatively small SAT instances.
For example, it is possible to construct SAT instances with about 5,600 variables that are
as hard as factorizing 100 bit integers. (Cf. the straightforward reduction yields SAT
instances with 7,600 variables.)

1. Introduction 



The satisfiability problem (SAT) is a central problem in various fields of computer science.
Precisely speaking, we consider the following "search problem": for a given propositional
Boolean formula, find an assignment of values to the propositional variables so that the
formula evaluates to true. This paper investigates ways of generating hard SAT instances.
(In this paper, we consider only "positive" instances, namely, satisfiable Boolean
formulas. Also we consider only conjunctive formulas; a formula may be a $k$-conjunctive
normal form formula, i.e., a conjunction of disjunctions of $k$ (or fewer) literals, or it may
be a $k$-extended conjunctive form formula, i.e., a conjunction of finite functions on $k$ (or
fewer) variables.)

While it has been known that SAT is NP-hard, we do not know so much$^1$ about its
concrete hardness. This contrasts with the factorization problem (FACT), i.e., the problem
of computing the prime factorization of a given number. While we do not know whether
FACT is NP-hard, we have developed some knowledge of its concrete hardness through
the development of algorithms and various experimental attacks on the problem (see,
e.g., [LL90]). Here we propose an approach for measuring the concrete hardness of SAT that
uses an efficient reduction from FACT to SAT. Theoretically, it is clear that FACT is
polynomial-time reducible to SAT, and that a SAT instance $F$ generated from a FACT
instance $x$ is as hard as factorizing $x$. The goal of this paper is to design efficient reductions
so that we can generate SAT instances with smaller size and higher hardness.

$^1$ There have been quite a lot of investigations into solving SAT, and important observations
have been made on the hardness of SAT (see, e.g., [Joh96]). Nevertheless, our knowledge is far
from satisfactory.

There are two somewhat different motivations for designing efficient reductions. 

First, with such efficient reductions, we can generate hard SAT instances that could 
be used to test the performance of various heuristics for SAT. In general, it is not so easy 
to generate good test instances. On the other hand, it is easy to generate hard instances 
for FACT; just generate two large prime numbers and multiply them. Thus, with an 
efficient reduction from FACT to SAT, we can generate hard SAT instances easily. Also, 
from FACT instances, it is easy to generate SAT instances with a unique solution; thus,
by negating the unique solution, we can easily generate "negative" SAT instances. (In
general, "negative" instance generation is difficult [AIM96].)

Secondly, with efficient reductions, we can analyze the concrete hardness of SAT. For
example, it has been widely believed that factorizing the product of two 256 bit prime
numbers is intractable. (In fact, even the degree of intractability has been discussed; see,
e.g., [Sch94].) Thus, by reducing such hard FACT instances, we can estimate the concrete
hardness of SAT.

Because of these motivations, the reductions we define must be efficient on the particular
range of sizes that we are interested in. Thus, a simple method is more appropriate
than asymptotically efficient but complicated methods. For example, by using the Fast
Fourier Transform ([SS71]; see also [Knu81]), one can define a reduction that yields formulas
with $O(\ell \log \ell \log\log \ell)$ variables from products of two $\ell$ bit prime numbers, which
is asymptotically the best (so far). Unfortunately, however, this reduction is almost useless
for our purpose due to its large constant factor.

In this paper, we propose one method of defining reductions, which is based on the
Chinese Remainder Theorem. Though the method is simple, we show that it gives us efficient
reductions. First, we define a reduction that uses the Chinese Remainder Theorem
recursively and yields formulas with $O(\ell^{1+\varepsilon})$ variables from products of two $\ell$
bit prime numbers, where $\varepsilon$ is any small constant. Clearly, this is not the best compared
with the one defined by using FFT. But because of its small constant factor, we may be able
to use this reduction (or the idea of the reduction) for generating relatively large instances,
say, formulas with 100,000 variables. Next, we define a reduction that works for the case
$\ell < 500$. For example, with this reduction, we can construct SAT instances with 5,600
variables that are as hard as factorizing products of two 50 bit prime numbers, which can be
used as test instances [Joh96]. (Cf. a naive reduction yields instances with 7,600 variables.)
The same reduction also yields SAT instances with 63,000 variables that are as hard as
factorizing products of two 256 bit prime numbers. Thus, we can conclude that SAT instances
with 63,000 variables contain some (in fact, many) intractable instances. (Cf. a naive
reduction yields instances with 197,000 variables.)

Notations 

Throughout this paper, we consider, for FACT instances, a product of two prime numbers
of the same length, and we use $\ell$ to denote their length (i.e., the number of bits). For any
$a_{\ell-1}, \dots, a_0 \in \{0, 1\}$, we regard $(a_{\ell-1}, \dots, a_0)$ as a binary representation of
some number. In general, for any $a_{\ell-1}, \dots, a_0 \in [0, b-1]$, $(a_{\ell-1}, \dots, a_0)$ is a
base $b$ representation of some number.

2. Basic Idea and Asymptotic Analysis 

Here we first explain the basic idea of our method, and then discuss the way to apply it 
recursively to get an asymptotically better reduction. 

Our goal is to generate, for a given integer $x = p \times q$, where $p$ and $q$ are $\ell$ bit prime
numbers, a SAT instance $F_x$ such that one can easily compute $p$ and $q$ from the satisfying
assignment of $F_x$. In the following, let us fix this $x$ and thus $p$ and $q$. Note that $F_x$ is
defined for each $x$, and $x$ can be embedded in the definition of $F_x$ as a constant. On the
other hand, our construction must be independent of $p$ and $q$; in other words, $F_x$ must
be constructed without knowing $p$ or $q$. (Otherwise, one may extract information on $p$ or
$q$ from $F_x$ without solving $F_x$.)

For our goal, consider, for example, $F_x^{\mathrm{ex1}}$ that satisfies the following:

\[
[\,(a_{\ell-1}, \dots, a_0) \times (b_{\ell-1}, \dots, b_0) = x\,]
\iff
[\,F_x^{\mathrm{ex1}}(a_{\ell-1}, \dots, a_0, b_{\ell-1}, \dots, b_0) = \text{true}\,].
\]

Here $a_i$ and $b_i$ are propositional variables, and we use them to represent nonnegative
integers. The satisfying assignment of this $F_x^{\mathrm{ex1}}$ is the binary representation of $p$ and
$q$, and thus, one can compute the factorization of $x$ by solving SAT on $F_x^{\mathrm{ex1}}$.

Here we take the following approach to generate $F_x$: (i) First design a circuit $C_x$, which
we call a test circuit, such that $C_x(a_{\ell-1}, \dots, a_0, b_{\ell-1}, \dots, b_0)$ checks whether
$(a_{\ell-1}, \dots, a_0) \times (b_{\ell-1}, \dots, b_0) = x$. (ii) Then convert it into a conjunctive form
formula $F_x$. In fact, there is a standard way to transform a circuit to a conjunctive form
formula (see Lemma 3.1), by which we can construct a conjunctive form formula $F_x$ with the
following property:

\[
\begin{aligned}
&[\,C_x(a_{\ell-1}, \dots, a_0, b_{\ell-1}, \dots, b_0) = 1\,]
\ \ (\iff [\,(a_{\ell-1}, \dots, a_0) \times (b_{\ell-1}, \dots, b_0) = x\,]) \\
&\quad\iff \exists u_1, \dots, u_t\ [\,F_x(a_{\ell-1}, \dots, a_0, b_{\ell-1}, \dots, b_0, u_1, \dots, u_t) = \text{true}\,].
\end{aligned}
\]

Clearly, this $F_x$ is also good enough for our purpose. Furthermore, the size of $F_x$, i.e., the
number of variables and clauses, is closely related to the number of gates of the circuit
$C_x$. Thus, our goal is now to design a test circuit $C_x$ with a small number of gates.

We can easily think of an $O(\ell^2)$ size circuit that multiplies two $\ell$ bit numbers, which
gives a test circuit $C_x^{\mathrm{naive}}$ of almost the same size. For the multiplication, asymptotically
the best one (so far) is obtained by using the Fast Fourier Transform ([SS71]; see also
[Knu81]). By using this algorithm, we can design $C_x^{\mathrm{FFT}}$ with $O(\ell \log \ell \log\log \ell)$ gates.
Unfortunately, though, due to its large constant factor, the size of circuits (and thus
formulas) obtained in this way becomes quite large in practice.

In this paper, we construct test circuits based on the Chinese Remainder Theorem.
Let $m_1, \dots, m_k$ be relatively prime numbers, and let $m = m_1 \cdot m_2 \cdots m_k$. The Chinese
Remainder Theorem claims that for any $x_1, \dots, x_k$ such that $0 \le x_i < m_i$ for each $i$, there
exists a unique $x$, $0 \le x < m$, such that $x \bmod m_i = x_i$ for all $i$, $1 \le i \le k$. The following
fact is immediate from this claim.

Fact 1. For any $2\ell$ bit number $x > 0$, let $m_1, \dots, m_k$ be relatively prime numbers such
that $m = m_1 \cdot m_2 \cdots m_k > 2^{2\ell}$. For any $\ell$ bit numbers $p, q$, we have $p \times q = x$ if and
only if $p \times q \equiv x \pmod{m_i}$ for all $i$, $1 \le i \le k$.
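
Fact 1 can be checked by brute force on a toy instance. The following is only a sketch: the 4 bit primes, the moduli, and the helper name `agrees_modulo_all` are illustrative assumptions, not values or names from the paper.

```python
from math import prod

def agrees_modulo_all(p, q, x, moduli):
    """True when p*q and x leave the same remainder for every modulus."""
    return all((p * q - x) % m == 0 for m in moduli)

# Hypothetical toy parameters: ell = 4, so x has at most 2*ell = 8 bits.
ell = 4
x = 11 * 13
moduli = [5, 7, 8, 9]                  # pairwise relatively prime (assumed choice)
assert prod(moduli) > 2 ** (2 * ell)   # the condition m > 2^(2*ell) of Fact 1

# Every pair of ell-bit numbers that passes all the modular checks.
matches = [(a, b)
           for a in range(2 ** ell)
           for b in range(2 ** ell)
           if agrees_modulo_all(a, b, x, moduli)]
```

Because both $a \times b$ and $x$ are below the product of the moduli, passing every modular check forces $a \times b = x$, so `matches` contains only the true factorizations.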

Let $m_1, \dots, m_k$ be relatively prime numbers such that $m = m_1 \cdot m_2 \cdots m_k > 2^{2\ell}$ for
our $x$. (Recall that $x$ is the product of two $\ell$ bit prime numbers.) Then we may consider
the following circuit $C_x^{\mathrm{ex2}}$ that checks whether $u \times v = x$, for two given numbers
$u = (a_{\ell-1}, \dots, a_0)$ and $v = (b_{\ell-1}, \dots, b_0)$.

(Step 1) For every $i$, $1 \le i \le k$, compute $u_i = u \bmod m_i$ and $v_i = v \bmod m_i$. (Also for
every $i$, $1 \le i \le k$, let $x_i = x \bmod m_i$. Note that these are constants, and we do
not have to compute them.)

(Step 2) For every $i$, $1 \le i \le k$, check whether $u_i \times v_i \equiv x_i \pmod{m_i}$. If all of them hold,
then output 1; otherwise, output 0.

Since the length of each $u_i$ and $v_i$ is much smaller than that of $u$ and $v$, we may expect to
reduce the complexity of checking. Note, however, that it is now necessary to compute each
$u_i$ and $v_i$, which is not so cheap in general. Also we need to compute $u_i \cdot v_i$ modulo $m_i$.

Here we use integers of the form $2^{e_i} - 1$ for each $m_i$. Then we can reduce the cost of
computing $u_i$, $v_i$, and $u_i \cdot v_i \bmod m_i$. As explained below (Claim 1), we can compute each
$u_i$ (resp., $v_i$) by some $O(\ell)$-size circuit. Also it will be shown later (Claim 6) that the cost
of computing $u_i \cdot v_i \bmod m_i$ is almost the same as that of ordinary multiplication; hence,
this task can be done by an $O(e_i^2)$-size circuit because both $u_i$ and $v_i$ are $e_i$ bit integers.

Note also that the relative primality of $2^e - 1$ and $2^{e'} - 1$ coincides with that of $e$ and $e'$
(see Fact 2 below). Thus, we can use $e_1 = \lceil \ell/2 \rceil$ and $e_2 = \lceil \ell/2 \rceil + 1$. On the
other hand, if we want to divide the checking into small pieces, we may choose the first $k$
prime numbers for $e_1, e_2, \dots, e_k$ such that $e_1 + e_2 + \cdots + e_k > \ell + k$ (where $+k$ is for
some margin). In this case, we can bound $k$ and $e_k$ by $O(e_k/\log e_k)$ and
$O((\ell \log \ell)^{1/2})$ respectively, and thus, the size of the test circuit $C_x^{\mathrm{ex2}}$ is bounded by
$O(\ell^{3/2} (\log \ell)^{1/2})$ [Hor97].

Fact 2. For any $e, e' \ge 1$, $2^e - 1$ and $2^{e'} - 1$ are relatively prime if and only if so are $e$
and $e'$.
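
Fact 2 is easy to confirm numerically for small exponents. The sketch below is an illustration only; the function names and the search limit are assumptions introduced here.

```python
from math import gcd

def mersenne_coprime(e1, e2):
    """True when 2^e1 - 1 and 2^e2 - 1 are relatively prime."""
    return gcd(2 ** e1 - 1, 2 ** e2 - 1) == 1

def fact2_holds(limit):
    """Compare both sides of Fact 2 for all exponent pairs below `limit`."""
    return all(mersenne_coprime(a, b) == (gcd(a, b) == 1)
               for a in range(1, limit)
               for b in range(1, limit))
```

This relies only on exact integer arithmetic, so it is a finite sanity check rather than a proof.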

Now to get an asymptotically better bound, we consider applying the Chinese Remainder
Theorem recursively. That is, we break down the test of $u_i \times v_i \equiv x_i \pmod{m_i}$
yet further. Unfortunately, however, a characterization like Fact 1 does not hold in
general. For example, while we have $12 \times 12 \equiv 20 \pmod{2^5 - 1}$, $12 \equiv 5 \pmod{2^3 - 1}$, and
$20 \equiv 6 \pmod{2^3 - 1}$, it does not hold that $5 \times 5 \equiv 6 \pmod{2^3 - 1}$. Here we extend Fact 1
as follows.

Fact 3. For any $e$ bit number $n > 1$, let $m_1, \dots, m_k$ be relatively prime numbers such
that $m = m_1 \cdot m_2 \cdots m_k > 2^{2e}$. Then for any $u$, $v$, and $y$, $0 \le u, v, y < n$, we have
$u \times v \equiv y \pmod{n}$ if and only if

\[
\exists w : 0 \le w < 2^{2e}\ \Bigl[\, w \equiv y \pmod{n} \ \text{ and } \bigwedge_{1 \le i \le k} u \times v \equiv w \pmod{m_i} \Bigr].
\]
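
The role of the guessed value $w$ in Fact 3 can be seen on the numbers used in the counterexample above. The following sketch searches for witnesses exhaustively; the function name `witnesses` and the moduli $2^3 - 1$, $2^4 - 1$, $2^5 - 1$ are assumptions chosen for illustration.

```python
def witnesses(u, v, y, n, e, moduli):
    """All w with 0 <= w < 2^(2e), w = y (mod n), and u*v = w (mod m_i) for every i."""
    return [w for w in range(2 ** (2 * e))
            if (w - y) % n == 0
            and all((u * v - w) % m == 0 for m in moduli)]

# n = 2^5 - 1 is a 5 bit number; the moduli multiply to 3255 > 2^10.
n, e = 31, 5
moduli = [7, 15, 31]
found = witnesses(12, 12, 20, n, e, moduli)   # 12 * 12 = 20 (mod 31) holds
missing = witnesses(12, 12, 21, n, e, moduli) # 12 * 12 = 21 (mod 31) fails
```

Since the product of the moduli exceeds $2^{2e}$, the only possible witness is $w = u \times v$ itself, which is why the test reduces correctly to residues.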

For any number $y$, and for any $e$ such that $y < 2^e$, we define a circuit $C^{\mathrm{rec}}_{y,e}$ that
checks whether $u \times v \equiv y \pmod{2^e - 1}$. (We will see that this circuit can be used as a test
circuit.) Intuitively, for given $u$ and $v$, we may consider that $C^{\mathrm{rec}}_{y,e}$ achieves the following
nondeterministic computation.

Let $e_1, \dots, e_k$ be relatively prime numbers such that $(2^{e_1} - 1) \cdots (2^{e_k} - 1) > 2^{2e}$.

(Step 1) Guess $w$, $0 \le w < 2^{2e}$, and check whether $w \equiv y \pmod{2^e - 1}$.

(Step 2) For every $i$, $1 \le i \le k$, compute $u_i = u \bmod (2^{e_i} - 1)$, $v_i = v \bmod (2^{e_i} - 1)$, and
$w_i = w \bmod (2^{e_i} - 1)$.

(Step 3) For every $i$, $1 \le i \le k$, check whether $u_i \times v_i \equiv w_i \pmod{2^{e_i} - 1}$ by using
$C^{\mathrm{rec}}_{w_i,e_i}$. If all of them hold, then output 1; otherwise, output 0.

We consider that $C^{\mathrm{rec}}_{y,e}$ accepts $u$ and $v$ if it outputs 1 on some guess $w$. Formally,
$C^{\mathrm{rec}}_{y,e}$ is a circuit with some additional input gates for $w$, and $C^{\mathrm{rec}}_{y,e}$ accepts $u$
and $v$ if and only if $C^{\mathrm{rec}}_{y,e}(u, v, w, w') = 1$ for some $w$ and $w'$. (Input $w'$ is used for
nondeterministic guesses in the recursive computation.) Then, it follows from Fact 3 that
$u \times v \equiv y \pmod{2^e - 1}$ holds if and only if $C^{\mathrm{rec}}_{y,e}(u, v, w, w') = 1$ for some $w$ and $w'$.

In order to determine $C^{\mathrm{rec}}_{y,e}$ precisely, we need to define $k$ and the way to select
$e_1, \dots, e_k$. Here we define $k = k(e)$ by using some unbounded but slowly increasing function
$k$, e.g., $k(e) = \log e$. For $e_1 < \cdots < e_k$, we choose the smallest $k$ primes larger than
$(2e + k)/k$. Then we have $(2^{e_1} - 1) \cdots (2^{e_k} - 1) > 2^{2e}$. It is easy to see that our choice of
parameters yields a circuit achieving the desired test.
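
The parameter choice just described can be sketched as follows. The helper names `is_prime` and `choose_exponents` are hypothetical, and $k$ is passed explicitly instead of being fixed to a particular function $k(e)$.

```python
from math import prod

def is_prime(n):
    """Trial-division primality test, adequate for small exponents."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def choose_exponents(e, k):
    """Smallest k primes strictly larger than (2e + k) / k."""
    candidate = (2 * e + k) // k + 1   # smallest integer above the bound
    chosen = []
    while len(chosen) < k:
        if is_prime(candidate):
            chosen.append(candidate)
        candidate += 1
    return chosen

def product_is_large_enough(e, exponents):
    """Check (2^e1 - 1) ... (2^ek - 1) > 2^(2e)."""
    return prod(2 ** ei - 1 for ei in exponents) > 2 ** (2 * e)
```

Each chosen $e_i$ exceeds $2e/k + 1$, so the $e_i - 1$ sum to more than $2e$; together with $2^{e_i} - 1 \ge 2^{e_i - 1}$ this gives the required product bound.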

Lemma 2.1. The size of $C^{\mathrm{rec}}_{y,e}$ is $O(e^{1+\varepsilon})$ for any $\varepsilon > 0$.

Proof. Here we fix any $\varepsilon > 0$, and show that there exists some constant $c$ such that
$\mathrm{size}(C^{\mathrm{rec}}_{y,e}) \le c \cdot e^{1+\varepsilon}$ for sufficiently large $y$ and $e$. In the following discussion,
let us also fix $y$ and $e$.

First we give an upper bound for computing $u \bmod (2^f - 1)$ for a given $u$. Although
results range from 0 to $2^f - 2$, we also allow $2^f - 1$, which is regarded as 0. Thus,
the binary representation of 0 is either $(0, 0, \dots, 0)$ or $(1, 1, \dots, 1)$. We call this slightly
relaxed way of representing numbers modulo $2^f - 1$ an extended binary representation.
The notation $u \bmod' (2^f - 1)$ is used to denote $u \bmod (2^f - 1)$ represented in the extended
binary representation. In order to distinguish it from $(1, 1, \dots, 1)$, we call $(0, 0, \dots, 0)$ the
real representation.

For our analysis, we need the following claim. (The claim is proved as a special case of
the corresponding one in Section 3. Thus, we omit its proof.)

Claim 1. For any $f \ge 1$, we can construct a circuit $\mathrm{MOD}_{e,f}$ with the following properties.

(1) $\mathrm{MOD}_{e,f}$ is an $e$ input and $f$ output circuit.

(2) On input $u$, $0 \le u \le 2^e - 1$, $\mathrm{MOD}_{e,f}(u)$ yields $u \bmod' (2^f - 1)$. Also the output
becomes the real representation if and only if $u = 0$.

(3) The size of $\mathrm{MOD}_{e,f}$ is bounded by $c_1 \cdot e$ for some constant $c_1$.
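
One way to realize the behaviour stated in Claim 1 at the level of integers is to add the $f$ bit blocks of $u$ and wrap each carry around. This is a sketch under that assumption, not the gate-level circuit of the paper; the name `mod_prime_repr` is hypothetical.

```python
def mod_prime_repr(u, e, f):
    """Extended representation of u mod (2^f - 1) for an e-bit input u.

    The f-bit blocks of u are summed; a carry out of bit f is added back
    at the bottom, which is valid because 2^f = 1 (mod 2^f - 1).
    """
    mask = (1 << f) - 1
    acc = 0
    for shift in range(0, e, f):
        acc += (u >> shift) & mask
        if acc > mask:                 # carry out of the top bit
            acc = (acc & mask) + 1     # wrap it around
    return acc
```

With this folding, the all-zero output appears only for $u = 0$; a nonzero multiple of $2^f - 1$ yields the all-ones pattern, matching property (2).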

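Now we show, by induction on $e$, that $\mathrm{size}(C^{\mathrm{rec}}_{y,e}) \le c \cdot e^{1+\varepsilon}$. From the outline of
$C^{\mathrm{rec}}_{y,e}$, we have the following bound.

\[
\begin{aligned}
\mathrm{size}(C^{\mathrm{rec}}_{y,e})
&= \sum_{i=1}^{k} \bigl( \mathrm{size}(C^{\mathrm{rec}}_{w_i,e_i}) + 2\,\mathrm{size}(\mathrm{MOD}_{e,e_i}) + \mathrm{size}(\mathrm{MOD}_{2e,e_i}) \bigr)
 + \mathrm{size}(\mathrm{MOD}_{2e,e}) + k \\
&\le \sum_{i=1}^{k} \bigl( c \cdot e_i^{1+\varepsilon} + 2c_1 \cdot e + c_1 \cdot 2e \bigr) + c_1 \cdot 2e + k
 \;\le\; \sum_{i=1}^{k} c \cdot e_i^{1+\varepsilon} + c_2 \cdot ke.
\end{aligned}
\]

Here the term $+k$ is for the number of AND gates that summarize the checks at (Step 1)
and (Step 3).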






Recall that we assume that $k$ is determined by a slowly growing function, and that
$e_1 < e_2 < \cdots < e_k$ are the smallest $k$ primes larger than $(2e + k)/k$. Hence, by using the
Prime Number Theorem, we can bound $e_k$ by $3e/k$ (for sufficiently large $e$). Thus, we
have

\[
\mathrm{size}(C^{\mathrm{rec}}_{y,e}) \le ck \cdot e_k^{1+\varepsilon} + c_2 \cdot ke
\le ck \left( \frac{3e}{k} \right)^{1+\varepsilon} + c_2 \cdot e^{1+\varepsilon},
\]

which is bounded by $c\,e^{1+\varepsilon}$ if $k$ (i.e., $k(e)$) is large enough. □

Finally, we define a SAT instance $F_x^{\mathrm{rec}}$. Precisely speaking, $C^{\mathrm{rec}}_{x,2\ell}$ is not a test circuit $C_x$;
but $C_x(u, v) = 1$ if and only if the partially assigned circuit $C^{\mathrm{rec}}_{x,2\ell}(u, v, -, -)$ is satisfiable.
Hence, the standard transformation from circuits to conjunctive normal form formulas
(Lemma 3.1) yields a SAT instance $F_x^{\mathrm{rec}}$ with the desired property. Furthermore, the size
of $F_x^{\mathrm{rec}}$ is almost the same as that of $C^{\mathrm{rec}}_{x,2\ell}$. Therefore, the following theorem holds.

Theorem 2.2. For any $\varepsilon > 0$, we can construct SAT instances with $O(\ell^{1+\varepsilon})$ variables
and clauses (in the conjunctive normal form) that are as hard as factorizing the product
of two $\ell$ bit prime numbers.

3. Concrete Examples 

Here we examine the applicability of our method with some concrete examples, i.e., the
cases where $\ell = 30, 40, \dots$. For such examples, to reduce the size of formulas, we need
some small techniques different from those of the previous section; in fact, the recursive
application of the Chinese Remainder Theorem does not work due to its large constant factor.

First we state our construction, and then estimate the size of the obtained Boolean
formulas. Here we follow the same approach as Section 2; that is, for any $x$, a product of
two $\ell$ bit prime numbers $p$ and $q$, we first define a test circuit and transform it to a SAT
instance. We fix $x$, $p$, and $q$ in the following discussion.

The key task is to test whether $u \times v = x$ for given $u$ and $v$. By using the Chinese
Remainder Theorem, we divide this test into small pieces of similar tests. Since we cannot
apply the Chinese Remainder Theorem recursively, we would like to divide the test into
pieces as small as possible. For example, we may choose the smallest $k$ prime numbers
$e_1, \dots, e_k$ such that $e_1 + \cdots + e_k > 2\ell + k$ and achieve the test by checking whether
$u_i \times v_i \equiv x_i \pmod{m_i}$ for all $i$, $1 \le i \le k$, where $m_i = 2^{e_i} - 1$, $u_i = u \bmod m_i$,
$v_i = v \bmod m_i$, and $x_i = x \bmod m_i$. Our main idea here is to use $m'_i = 2^{e_i} + 1$ as well as
$m_i = 2^{e_i} - 1$. We also use $m_0 = 2^{e_0}$ for some $e_0 \ge 1$. (In the following, we let
$u'_i = u \bmod m'_i$, $v'_i = v \bmod m'_i$, $x'_i = x \bmod m'_i$, $u_0 = u \bmod m_0$, $v_0 = v \bmod m_0$, and
$x_0 = x \bmod m_0$.)

Note that for any $e$, one of $2^e - 1$ and $2^e + 1$ is divisible by 3; but 3 is the largest
common factor of $2^e \pm 1$ and $2^{e'} \pm 1$ for any relatively prime $e$ and $e'$, $e \ne e'$. Also $2^e$ is
relatively prime to any $2^{e'} \pm 1$.

Fact 4. For any relatively prime numbers $e, e' \ge 2$, $\gcd(2^e \pm 1, 2^{e'} \pm 1) = 1$ or 3. (Clearly,
$\gcd(2^e - 1, 2^e + 1) = 1$.)
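
Fact 4 can be confirmed for small exponents with a few lines of exact arithmetic. The function name and the exponent range below are assumptions made for this illustration.

```python
from itertools import product
from math import gcd

def fact4_gcds(e1, e2):
    """Set of gcd(2^e1 + s1, 2^e2 + s2) over all sign choices s1, s2 in {-1, +1}."""
    return {gcd(2 ** e1 + s1, 2 ** e2 + s2)
            for s1, s2 in product((-1, 1), repeat=2)}

# All pairs of relatively prime exponents in a small (assumed) range.
coprime_pairs = [(a, b) for a in range(2, 26) for b in range(2, 26)
                 if gcd(a, b) == 1]
```

Every set returned by `fact4_gcds` on these pairs should be a subset of $\{1, 3\}$.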

We note that the Chinese Remainder Theorem (i.e., Fact 1) works if
$\mathrm{lcm}(m_0, m_1, \dots, m_k, m'_1, \dots, m'_k) > 2^{2\ell}$. Hence, roughly speaking, it is enough to choose
relatively prime numbers $e_1, \dots, e_k$ and some $e_0$ such that
$2(e_1 + \cdots + e_k) + e_0 - k \log 3 > 2\ell$. Clearly, this idea enables us to choose smaller moduli.
Furthermore, there is another advantage of using both $m_i = 2^{e_i} - 1$ and $m'_i = 2^{e_i} + 1$. As
we see below (Claim 5), most of the computation of $u_i = u \bmod m_i$ and $u'_i = u \bmod m'_i$ can
be shared, and $u'_i$ is computable almost as a byproduct of $u_i$. It is also shown (Claim 6)
that the multiplication cost modulo $m'_i$ is almost the same as the multiplication cost
modulo $m_i$.

To summarize, we choose $e_0, e_1, \dots, e_k$ so that $\mathrm{lcm}(m_0, m_1, \dots, m_k, m'_1, \dots, m'_k) > 2^{2\ell}$,
and construct $C_x^{\mathrm{cex}}$ that tests whether $u \times v = x$ for given inputs $u$ and $v$ in the following way.

(Step 1) Compute $u_i$, $u'_i$, $v_i$, and $v'_i$ for every $i$, $1 \le i \le k$. (Note that $u_0$ (resp., $v_0$) is just
the last $e_0$ bits of $u$ (resp., $v$), and hence, we do not need to compute them.)

(Step 2) Check whether $u_i \times v_i \equiv x_i \pmod{m_i}$ and $u'_i \times v'_i \equiv x'_i \pmod{m'_i}$ for every $i$,
$1 \le i \le k$, and also check whether $u_0 \times v_0 \equiv x_0 \pmod{m_0}$. If all of them hold, then
output 1; otherwise, output 0.
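
The two steps above can be mimicked with ordinary integer arithmetic. In this sketch the exponents $e_0 = 16$ and $e_i = 4, 5, 7, 9$ are taken from the $\ell = 30$ row of Table 1; the function names and the two sample primes are illustrative assumptions.

```python
from functools import reduce
from math import gcd

def moduli_for(e0, exponents):
    """m_0 = 2^e0 together with m_i = 2^ei - 1 and m'_i = 2^ei + 1."""
    mods = [2 ** e0]
    for e in exponents:
        mods += [2 ** e - 1, 2 ** e + 1]
    return mods

def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)

def cex_test(u, v, x, e0, exponents):
    """Step 1 and Step 2: compare residues of u*v and x for every modulus."""
    return all((u % m) * (v % m) % m == x % m
               for m in moduli_for(e0, exponents))

E0, EXPONENTS = 16, [4, 5, 7, 9]      # the ell = 30 row of Table 1
p, q = 1000000007, 998244353          # two 30 bit primes, used only as sample inputs
```

Calling `cex_test(p, q, p * q, E0, EXPONENTS)` accepts, while any other pair of 30 bit inputs whose product differs from `p * q` is rejected because the least common multiple of the moduli exceeds $2^{60}$.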

Now we estimate the size of $C_x^{\mathrm{cex}}$ in detail. First we remark on the type of gates used
in circuits. Though it is standard to construct circuits by using 2-fan-in gates, here we also
use 3-fan-in gates, since 3-fan-in gates are useful for addition and subtraction. Clearly,
we can reduce circuit size by using $k$-fan-in gates for larger $k$; but the number of clauses
in the conjunctive form grows proportionally to $2^k$. By using 3-fan-in gates, we can
not only simplify our argument, but also reduce the total number of clauses in
the conjunctive form. In the following, in order to distinguish the number of 2-fan-in and
3-fan-in gates, we write, e.g., $\mathrm{size}(C) = \boxed{320} + 1500$, by which we mean that $C$ consists
of 320 3-fan-in gates and 1500 2-fan-in gates.

First we state a precise relationship between a circuit $C$ and a SAT instance $F$
transformed from $C$ by the standard reduction.

Lemma 3.1. Let $C$ be a circuit with $n$ inputs, $s_1$ fan-in-2 gates, and $s_2$ fan-in-3 gates;
let $m = s_1 + s_2$. From this $C$, we can construct a formula $F$ in the extended conjunctive
form with $n + m$ variables and $m$ clauses that simulates $C$ in the following sense:

\[
[\,C(a_1, \dots, a_n) = 1\,] \iff \exists u_1, \dots, u_m\ [\,F(a_1, \dots, a_n, u_1, \dots, u_m) = \text{true}\,].
\]

The formula can be transformed into the 4-conjunctive normal form with at most $4s_1 + 8s_2$
clauses.
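
The clause counts $4s_1 + 8s_2$ in Lemma 3.1 are consistent with the usual gate-by-gate encoding, in which each gate contributes one clause per input combination. The sketch below assumes that encoding; the function names and the signed-integer literal convention are choices made here, not taken from the paper.

```python
from itertools import product

def gate_clauses(fn, in_vars, out_var):
    """CNF clauses forcing out_var = fn(inputs): one clause per input combination.

    Literals are signed integers: v means "v is true", -v means "v is false".
    Each clause is falsified exactly by one inconsistent (inputs, output) row.
    """
    clauses = []
    for bits in product((0, 1), repeat=len(in_vars)):
        wrong = 1 - fn(*bits)                       # the forbidden output value
        clause = [v if b == 0 else -v for v, b in zip(in_vars, bits)]
        clause.append(out_var if wrong == 0 else -out_var)
        clauses.append(clause)
    return clauses

def satisfied(clauses, assignment):
    """assignment maps variable number -> 0/1."""
    return all(any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)
               for clause in clauses)

and_cnf = gate_clauses(lambda a, b: a & b, [1, 2], 3)
maj_cnf = gate_clauses(lambda a, b, c: int(a + b + c >= 2), [1, 2, 3], 4)
```

A 2-fan-in gate yields 4 clauses of 3 literals and a 3-fan-in gate yields 8 clauses of 4 literals, so every clause fits the 4-conjunctive normal form.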
Next we prepare circuits for some basic arithmetic operations.

Claim 2. The addition of a one bit number to an $e$ bit number is computable by a circuit
$\mathrm{INC}_e$ with $\mathrm{size}(\mathrm{INC}_e) = 2e$. We use $\mathrm{inc}(e)$ to denote this circuit size.

Proof. The circuit $\mathrm{INC}_e$ is defined as in Figure 1 below. Here gates with label $\oplus$ are
exclusive-or gates. □

Fig. 1: Circuit $\mathrm{INC}_e$

Claim 3. The addition of two $e$ bit numbers is computable by a circuit $\mathrm{ADD}_e$ with
$\mathrm{size}(\mathrm{ADD}_e) = \boxed{2e}$. We use $\mathrm{add}(e)$ to denote this circuit size.

Proof. The circuit $\mathrm{ADD}_e$ is defined as in Figure 2 below. Here gates with label C are gates
computing the current bit from two input bits and a carry. □

Fig. 2: Circuit $\mathrm{ADD}_e$

Claim 4. The subtraction of two $e$ bit numbers is computable by a circuit $\mathrm{SUB}_e$ with
$\mathrm{size}(\mathrm{SUB}_e) = \boxed{2e}$. More precisely, $\mathrm{SUB}_e$ takes two $e$ bit numbers $u$ and $v$ as input, and
outputs $(u - v) \bmod 2^e$ and $c$ indicating whether $u - v \ge 0$ ($c = 0$ if $u - v \ge 0$, and $c = 1$
otherwise). We use $\mathrm{sub}(e)$ to denote this circuit size.
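
The gate counts in Claims 2 and 3 match a ripple-carry layout with two gates per bit position. The following is a sketch under that assumption (the figures themselves are not reproduced here); the function names are hypothetical.

```python
def inc_bits(x, bit, e):
    """Add a one-bit number to an e-bit number; two 2-fan-in gates per position."""
    total, carry, gates = 0, bit, 0
    for i in range(e):
        a = (x >> i) & 1
        total |= (a ^ carry) << i      # XOR gate: sum bit
        carry = a & carry              # AND gate: next carry
        gates += 2
    return total, carry, gates

def add_bits(x, y, e, carry_in=0):
    """Add two e-bit numbers; two 3-fan-in gates per position."""
    total, carry, gates = 0, carry_in, 0
    for i in range(e):
        a, b = (x >> i) & 1, (y >> i) & 1
        s = a ^ b ^ carry                       # 3-fan-in gate: sum bit
        carry = (a & b) | (carry & (a ^ b))     # 3-fan-in gate: majority
        total |= s << i
        gates += 2
    return total, carry, gates
```

Because every position of `add_bits` already takes a carry input, an initial carry can be fed in without extra gates, which is the property used later when estimating the size of the modular reduction circuit.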
Claim 5. We can construct a circuit $\mathrm{MOD}_e$ with the following properties.

(1) $\mathrm{MOD}_e$ is an $\ell$ input and $2e + 1$ output circuit.

(2) On input $u$, $\mathrm{MOD}_e(u)$ yields $u \bmod' (2^e - 1)$ and $u \bmod (2^e + 1)$ at the first $e$ output
gates and the last $e + 1$ gates respectively.

(3) The size of $\mathrm{MOD}_e$ is $\boxed{2\ell + 2e} + 4e + 2\ell'$, where $\ell' = \ell - (\ell \bmod e)$.



Proof. Let $u$ be an $\ell$ bit number, for which we want to compute $s = u \bmod' (2^e - 1)$ and
$t = u \bmod (2^e + 1)$. Let $(u_0, \dots, u_{h-1})$ be its base $2^e$ representation. That is,
$u = u_0 + u_1 2^e + u_2 2^{2e} + \cdots + u_{h-1} 2^{(h-1)e}$, where $h = \lceil \ell/e \rceil$. Here we assume that
$h - 1$ is even and $h - 1 = 2h'$ for some $h'$. (The odd case is treated similarly.) Then we have

\[
\begin{aligned}
s &= (u_0 + u_1 + u_2 + u_3 + \cdots + u_{2h'}) \bmod' (2^e - 1) \\
  &= \bigl((u_0 + u_2 + \cdots + u_{2h'}) + (u_1 + u_3 + \cdots + u_{2h'-1})\bigr) \bmod' (2^e - 1), \text{ and} \\
t &= (u_0 - u_1 + u_2 - u_3 + \cdots + u_{2h'}) \bmod (2^e + 1) \\
  &= \bigl((u_0 + u_2 + \cdots + u_{2h'}) - (u_1 + u_3 + \cdots + u_{2h'-1})\bigr) \bmod (2^e + 1).
\end{aligned}
\]

Note also that for any $x, y$, $0 \le x, y < 2^e$, we have

\[
\begin{aligned}
(x + y) \bmod' (2^e - 1) &= (x + y) \bmod 2^e + c_{x,y}, \text{ and} \\
(x + y) \bmod' (2^e + 1) &= (x + y) \bmod 2^e - c_{x,y},
\end{aligned}
\]

where $c_{x,y}$ is the $(e + 1)$th bit of $x + y$, or the $e$th carry of $x + y$.

These observations suggest that we compute the following $s_+$ and $s_-$.

\[
\begin{aligned}
s_+ &= \bigl(\cdots\bigl((u_0 + u_2) \bmod 2^e + u_4 + c_3\bigr) \bmod 2^e + \cdots + u_{2h'} + c_{2h'-1}\bigr) \bmod 2^e, \text{ and} \\
s_- &= \bigl(\cdots\bigl((u_1 + u_3 + c_2) \bmod 2^e + u_5 + c_4\bigr) \bmod 2^e + \cdots + u_{2h'-1} + c_{2h'-2}\bigr) \bmod 2^e,
\end{aligned}
\]

where $c_i$ is the $e$th carry of the addition of a partial sum $a_{i-2}$ and $u_i + c_{i-1}$. The following
figure illustrates this computation.



Fig. 3: Computation of $s_+$ and $s_-$.
Then it is easy to see that $s$ and $t$ are obtained by

\[
\begin{aligned}
s &= (s_+ + s_- + c_{2h'}) \bmod 2^e + c_+, \text{ and} \\
t &= (s_+ - s_- - c_{2h'}) \bmod 2^e + c_-,
\end{aligned}
\]

where $c_+$ and $c_-$ are respectively the $e$th carry of $s_+ + s_- + c_{2h'}$ and the negative $e$th carry
of $s_+ - s_- - c_{2h'}$.

Our circuit $\mathrm{MOD}_e$ is defined following this outline. Recall that $\mathrm{ADD}_e$ can be modified
with no additional gate for adding two numbers with a carry (Claim 3); the same property
holds for $\mathrm{SUB}_e$. Thus, the size of $\mathrm{MOD}_e$ is estimated as follows.



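\[
\begin{aligned}
\mathrm{size}(\mathrm{MOD}_e)
&= (2h' - 1)\,\mathrm{add}(e) + \mathrm{add}(\ell'') + \mathrm{add}(e) + \mathrm{sub}(e) + \mathrm{inc}(\ell') + 2\,\mathrm{inc}(e) \\
&= \boxed{(h - 2) \cdot 2e + 2\ell'' + 4e} + 2\ell' + 4e
 \;=\; \boxed{2\ell + 2e} + 4e + 2\ell'.
\end{aligned}
\]

Here $\ell'' = \ell \bmod e$ and $\ell' = \ell - \ell''$. Note that adding $u_{2h'}$ to the partial sum is computed
with two circuits $\mathrm{ADD}_{\ell''}$ and $\mathrm{INC}_{\ell'}$. □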
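
The identities behind this circuit can be checked at the integer level: the sum of the base $2^e$ digits gives the residue modulo $2^e - 1$, and the alternating sum gives the residue modulo $2^e + 1$. The sketch below verifies only this arithmetic, not the carry-chain wiring; `mod_pair` is a hypothetical name.

```python
def mod_pair(u, ell, e):
    """Return (s, t): s is u mod (2^e - 1) in extended form, t is u mod (2^e + 1)."""
    mask = (1 << e) - 1
    digits = [(u >> shift) & mask for shift in range(0, ell, e)]
    even_sum = sum(digits[0::2])       # u_0 + u_2 + ...
    odd_sum = sum(digits[1::2])        # u_1 + u_3 + ...
    s = even_sum + odd_sum
    while s > mask:                    # fold carries back: 2^e = 1 (mod 2^e - 1)
        s = (s & mask) + (s >> e)
    t = (even_sum - odd_sum) % ((1 << e) + 1)   # 2^e = -1 (mod 2^e + 1)
    return s, t
```

Both outputs reuse the same two partial sums, which reflects why the two residues can share most of their computation.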



Claim 6. For any $e \ge 1$, we can construct circuits $\mathrm{MULT}_e$ and $\mathrm{MULT}'_e$ with the following
properties.

(1) $\mathrm{MULT}_e$ is a $2e$ input and $e$ output circuit, and $\mathrm{MULT}'_e$ is a $2(e + 1)$ input and $e + 1$
output circuit.

(2) For any pair of input integers $u$ and $v$, $0 \le u, v \le 2^e - 1$, $\mathrm{MULT}_e$ computes
$u \cdot v \bmod' (2^e - 1)$. Similarly, for any pair of input integers $u$ and $v$, $0 \le u, v < 2^e + 1$,
$\mathrm{MULT}'_e$ computes $u \cdot v \bmod (2^e + 1)$.

(3) The sizes of $\mathrm{MULT}_e$ and $\mathrm{MULT}'_e$ are bounded by $\boxed{2(e - 1)e} + e^2 + 2e$ and
$\boxed{2e^2 + e + 1} + e^2 + 4e$ respectively.



Proof. First we consider $\mathrm{MULT}_e$. Consider any integers $u, v$, $0 \le u, v \le 2^e - 1$; let
$(a_{e-1}, \dots, a_0)$ and $(b_{e-1}, \dots, b_0)$ be binary representations of $u$ and $v$ respectively.
Intuitively, $w = u \cdot v \bmod (2^e - 1)$ is computed as in Figure 4. More specifically, it is
computed as (1) below. Here $u_i = (a_{e-i-1}, \dots, a_0, a_{e-1}, \dots, a_{e-i}) \times b_i$; that is, each bit of
$u_i$ is computed as $a_j \wedge b_i$. Hence, for computing $w$, we need $e - 1$ $\mathrm{ADD}_e$ circuits, one
$\mathrm{INC}_e$ circuit, and $e^2$ AND gates.

Fig. 4: $u \cdot v \bmod (2^e - 1)$, computed as the sum of the cyclically rotated copies
$(a_{e-2}, \dots, a_1, a_0, a_{e-1}) \times b_1$, ..., $(a_0, a_{e-1}, a_{e-2}, \dots, a_1) \times b_{e-1}$ of
$(a_{e-1}, a_{e-2}, \dots, a_1, a_0)$. (Carries are omitted here.)

\[
w = \bigl(\cdots\bigl((u_0 + u_1) \bmod 2^e + u_2 + c_1\bigr) \bmod 2^e + \cdots + u_{e-1} + c_{e-2}\bigr) \bmod 2^e. \tag{1}
\]

Thus, the size of $\mathrm{MULT}_e$ is estimated as follows.

\[
\mathrm{size}(\mathrm{MULT}_e) = (e - 1)\,\mathrm{add}(e) + \mathrm{inc}(e) + e^2 = \boxed{2(e - 1)e} + e^2 + 2e.
\]
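
Multiplication modulo $2^e - 1$ by rotated partial products can be sketched with integers as follows. Multiplying by $2^i$ modulo $2^e - 1$ is a cyclic rotation by $i$ positions, and each addition wraps its carry around. The function names are hypothetical, and the result is in the extended representation (the all-ones pattern may stand for 0).

```python
def rotl(a, i, e):
    """Cyclically rotate the e-bit number a left by i positions."""
    mask = (1 << e) - 1
    i %= e
    return ((a << i) | (a >> (e - i))) & mask

def eac_add(x, y, e):
    """Add two e-bit numbers and wrap the carry around (end-around carry)."""
    mask = (1 << e) - 1
    s = x + y
    return (s & mask) + 1 if s > mask else s

def mult_mersenne(u, v, e):
    """u * v modulo 2^e - 1 as a sum of rotated copies of u, one per set bit of v."""
    acc = 0
    for i in range(e):
        if (v >> i) & 1:               # partial product u_i = rot(u, i) * b_i
            acc = eac_add(acc, rotl(u, i, e), e)
    return acc
```

No separate reduction step is needed: every intermediate value stays within $e$ bits, which is why the cost is close to that of ordinary $e$ bit multiplication.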



Next we define the circuit $\mathrm{MULT}'_e$. This time $u$ and/or $v$ can be $2^e$. Hence, we need to
represent them as $(a_e, \dots, a_0)$ and $(b_e, \dots, b_0)$; but let us also consider
$u' = (a_{e-1}, \dots, a_0)$ and $v' = (b_{e-1}, \dots, b_0)$. Then we have

\[
w = u \cdot v \bmod (2^e + 1) = \bigl(u' \cdot v' \bmod (2^e + 1) - (u'' + v'') + a_e \cdot b_e\bigr) \bmod (2^e + 1),
\]

where $u'' = u' \cdot b_e$ and $v'' = v' \cdot a_e$.

We first consider how to compute $u' \cdot v' \bmod (2^e + 1)$. Just compute $u' \cdot v'$ in the
standard way, which gives us a $2e$ bit number. Let $w_-$ and $w_+$ denote the numbers at the
first $e$ bits and the last $e$ bits respectively. Then we have
$u' \cdot v' \bmod (2^e + 1) = (w_+ - w_-) \bmod 2^e + c$, where $c$ is the negative $e$th carry of
$w_+ - w_-$. Thus, $w$ is obtained by $(w_+ - (w_- + u'' + v'')) \bmod 2^e + c + a_e \cdot b_e$. Notice here
that at most one of $w_-$, $u''$, $v''$ is nonzero. Hence, $w_- + u'' + v''$ is computable by bit-wise
or, which can be done by $e$ 3-fan-in OR gates. Similarly, if $a_e \cdot b_e = 1$, then the other term
for $w$ is zero. Thus, the size of our circuit $\mathrm{MULT}'_e$, which computes $w$ following this outline,
is estimated as follows.

\[
\begin{aligned}
\mathrm{size}(\mathrm{MULT}'_e)
&= (\text{\# of gates for } u' \cdot v') + (\text{\# of gates for } u'' \text{ and } v'') \\
&\quad + (\text{\# of gates for } w_- + u'' + v'') + \mathrm{sub}(e) + \mathrm{inc}(e)
 + (\text{\# of gates for } {+}\,a_e \cdot b_e) \\
&= \boxed{2(e - 1)e} + e^2 + 2e + \boxed{e} + \boxed{2e} + 2e + \boxed{1} \\
&= \boxed{2e^2 + e + 1} + e^2 + 4e.
\end{aligned}
\]

□
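
The decomposition used for $\mathrm{MULT}'_e$ can be followed step by step with integers. This sketch mirrors the outline above under the convention that $w_-$ is the high half and $w_+$ the low half of $u' \cdot v'$; the function name is hypothetical.

```python
def mult_fermat(u, v, e):
    """u * v modulo 2^e + 1 for 0 <= u, v <= 2^e, following the outline above."""
    mask = (1 << e) - 1
    a_e, b_e = u >> e, v >> e              # top bits: set only when the input is 2^e
    u1, v1 = u & mask, v & mask            # u' and v'
    product_2e = u1 * v1                   # ordinary 2e-bit product
    w_minus, w_plus = product_2e >> e, product_2e & mask
    u2, v2 = u1 * b_e, v1 * a_e            # u'' and v''
    subtrahend = w_minus | u2 | v2         # at most one of the three is nonzero
    diff = w_plus - subtrahend
    c = 1 if diff < 0 else 0               # negative carry of the subtraction
    return (diff & mask) + c + a_e * b_e
```

The bit-wise or is valid because an input equal to $2^e$ has a zero low part, which forces $w_-$ and one of $u''$, $v''$ to vanish.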

Now the size of our test circuit $C_x^{\mathrm{cex}}$, which uses these circuits, is estimated as follows.

Lemma 3.2. The circuit $C_x^{\mathrm{cex}}$ outlined above tests whether $u \cdot v = x$ for given inputs $u$
and $v$, and we can bound its size as follows, where $\ell$ is the length of $x$'s prime factors,
$e_0, \dots, e_k$ are parameters defined above, and $\ell'_i = \ell - \ell \bmod e_i$, $1 \le i \le k$.

\[
\mathrm{size}(C_x^{\mathrm{cex}}) \le
\boxed{\sum_{i=1}^{k} (4e_i^2 + 3e_i) + e_0^2 - e_0 + 4k\ell + k}
+ \sum_{i=1}^{k} (2e_i^2 + 16e_i + 2\ell'_i) + e_0^2/2 + e_0/2 - 2.
\]



Proof. It follows from the above outline that $C_x^{\mathrm{cex}}$ consists of (i) for each $i$, $1 \le i \le k$, two
$\mathrm{MOD}_{e_i}$, one $\mathrm{MULT}_{e_i}$, and one $\mathrm{MULT}'_{e_i}$ circuits, (ii) a circuit for computing
$u \times v \bmod 2^{e_0}$, and (iii) gates for checking that every obtained product is equal to $x_i$. It is
now easy to see that a circuit for $u_0 \times v_0 \bmod 2^{e_0}$ requires $\boxed{(e_0 - 1)e_0} + (e_0 - 1)e_0/2$
gates, and that the whole equality check can be done with
$e_0 - 1 + \sum_{i=1}^{k} (2e_i - 1) + k - 1$ gates. Hence, we have

\[
\begin{aligned}
\mathrm{size}(C_x^{\mathrm{cex}})
&= \sum_{i=1}^{k} \bigl(2\,\mathrm{size}(\mathrm{MOD}_{e_i}) + \mathrm{size}(\mathrm{MULT}_{e_i}) + \mathrm{size}(\mathrm{MULT}'_{e_i})\bigr) \\
&\quad + \boxed{(e_0 - 1)e_0} + (e_0 - 1)e_0/2 + e_0 - 1 + \sum_{i=1}^{k} (2e_i - 1) + k - 1 \\
&= \sum_{i=1}^{k} \Bigl( \boxed{4\ell + 4e_i + 2(e_i - 1)e_i + 2e_i^2 + e_i + 1}
 + 8e_i + 2\ell'_i + e_i^2 + 2e_i + e_i^2 + 4e_i \Bigr) \\
&\quad + \boxed{(e_0 - 1)e_0} + (e_0 - 1)e_0/2 + e_0 - 1 + \sum_{i=1}^{k} (2e_i - 1) + k - 1 \\
&= \boxed{\sum_{i=1}^{k} (4e_i^2 + 3e_i) + e_0^2 - e_0 + 4k\ell + k}
 + \sum_{i=1}^{k} (2e_i^2 + 16e_i + 2\ell'_i) + e_0^2/2 + e_0/2 - 2.
\end{aligned}
\]

□

Theorem 3.3. For a given $x$, a product of two $\ell$ bit prime numbers, we can construct
a SAT instance $F_x^{\mathrm{cex}}$ that is as hard as factorizing $x$, and that has at most the following
number of variables, where $e_0, \dots, e_k$ and $\ell'_1, \dots, \ell'_k$ are parameters defined above.

\[
\sum_{i=1}^{k} (6e_i^2 + 19e_i + 2\ell'_i) + 3e_0^2/2 - e_0/2 + 4k\ell + k + 2\ell - 2.
\]

$F_x^{\mathrm{cex}}$ has at most this number of clauses in the extended 4-conjunctive form and at most
$\sum_{i=1}^{k} (40e_i^2 + 88e_i + 8\ell'_i) + 10e_0^2 - 6e_0 + 32k\ell + 8k - 8$ clauses in the 4-conjunctive normal
form.

Now we estimate the size of formulas for several concrete cases. For comparison, let us
also estimate the size of the formula $F_x^{\mathrm{naive}}$ obtained from $x$ by the straightforward
reduction mentioned in the Introduction. (For our concrete examples, formulas obtained by
using the FFT become much larger than the ones obtained by the straightforward reduction.)

Proposition 3.4. For a given $x$, a product of two $\ell$ bit prime numbers, the formula
$F_x^{\mathrm{naive}}$ has $3\ell^2 + 2\ell - 1$ variables. It has about this number of clauses in the extended
4-conjunctive form and at most $20\ell^2 - 8\ell - 4$ clauses in the 4-conjunctive normal form.

Proof. It is easy to show that the size of the straightforward circuit multiplying two $\ell$
bit numbers is $(\ell - 1) \cdot \mathrm{add}(\ell) + \ell^2 = \boxed{2(\ell - 1)\ell} + \ell^2$. The test circuit needs $2\ell - 1$ more
gates for checking whether the obtained product is equal to $x$, and thus, its size becomes
$\boxed{2(\ell - 1)\ell} + \ell^2 + 2\ell - 1$. Then the above bounds follow from Lemma 3.1. □
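
The bounds of Proposition 3.4 follow mechanically from the gate counts and Lemma 3.1, as the short computation below shows. The function name `naive_sizes` is an assumption for this sketch.

```python
def naive_sizes(ell):
    """Variables and 4-CNF clause bound for the straightforward reduction."""
    three_fan_in = 2 * (ell - 1) * ell          # adders of the multiplier
    two_fan_in = ell ** 2 + 2 * ell - 1         # AND gates plus the equality check
    variables = 2 * ell + three_fan_in + two_fan_in   # inputs + one variable per gate
    cnf_clauses = 8 * three_fan_in + 4 * two_fan_in   # Lemma 3.1: 4*s1 + 8*s2
    return variables, cnf_clauses
```

For $\ell = 40$ and $\ell = 256$ this reproduces the corresponding naive-reduction entries of Table 1.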
Table 1 below shows the size of $F_x^{\mathrm{cex}}$ and $F_x^{\mathrm{naive}}$ obtained from $x$, a product of two $\ell$ bit
prime numbers; that is, solving the SAT problem for $F_x^{\mathrm{cex}}$ and $F_x^{\mathrm{naive}}$ is as hard as
factorizing $x$. The column "# of var.s" gives the number of variables of the obtained formulas;
hence, it also bounds the number of clauses of the formulas in the extended 4-conjunctive form.
On the other hand, the column "# of clauses" gives the number of clauses of the formulas in the
4-conjunctive normal form. For these formulas, the number of clauses in the 4-conjunctive
normal form is approximately 6 times larger than the number of variables.



         F_x^naive                    F_x^cex
  ℓ      # of var.s   # of clauses   # of var.s   # of clauses   e_0, e_1, ...
  30     2,759        17,756         2,767        17,240         16, 4, 5, 7, 9
  40     4,879        31,676         4,103        25,728         16, 7, 8, 9, 11
  50     7,599        49,596         5,657        35,776         27, 5, 7, 8, 9, 11
  60     10,919       71,516         7,315        46,328         23, 5, 7, 8, 9, 11, 13
  70     14,839       97,436         9,347        59,448         27, 5, 7, 9, 11, 13, 16
  128    49,407       326,652        22,165       142,344        27, 7, 11, 13, 15, 16, 17, 19, 23
  256    197,119      1,308,668      63,652       406,860        62, 7, 11, 13, 17, 19, 23, 25, 27, 29, 31, 32

Table 1: The size of formulas



Consider first the task of generating test instances for a given SAT algorithm. From
the viewpoint of the Factorization Problem (FACT), the case $\ell = 30$, i.e., factorizing
a product of two 30 bit primes, is not so difficult. It is solvable in a few minutes by a
straightforward algorithm on a small workstation. But the problem suddenly becomes
difficult when $\ell \ge 40$. Thus, those instances generated with $\ell = 40$ or $\ell = 50$ would
be quite good examples for testing the performance of SAT algorithms. Note that if we
use some advanced algorithm like the Quadratic Sieve, factorization up to $\ell = 100$ is
computable in one to two hours on a mid-size workstation [Kob97]. But it is hard to
think of a SAT algorithm incorporating such a specialized algorithm.

Next we analyze the hardness of SAT by using our knowledge of the hardness of
FACT. It has been widely believed (see, e.g., [Sch94]) that factorizing 512 bit numbers is
hard, which is the case $\ell = 256$. Now from Table 1, this corresponds via our reduction
to SAT instances with approximately 63,000 variables. That is, some (in fact many) SAT
instances with 63,000 variables are intractable. Notice that by the straightforward
reduction, we cannot show the same hardness unless SAT instances have more than 190,000
variables. In Table 1, we also estimate the size of SAT instances generated from 256 bit
numbers (i.e., $\ell = 128$), which are still quite difficult to factorize in practice (i.e., a
one-day task on a mid-size workstation [Kob97]).